The Application of Quantitative Metabolomics for the Taxonomic Differentiation of Birds

Simple Summary Modern evolutionary biology offers a wide variety of methods to explore the evolution of species and to describe their relationships. The methods of DNA/RNA sequence analysis have been developing for decades and have become increasingly popular and reasonably reliable. Nevertheless, final phylogenetic trees for many taxa are still under debate because both classical and genomics-based approaches have their own limitations for phylogenetic tree reconstruction. Here, we propose the use of younger ‘omics’ methods, namely quantitative metabolomics, to aid the phylogeny reconstruction of vertebrates. We show that metabolomics-based hierarchical clustering analysis trees match, although not perfectly, to the genomics-based trees. Abstract In the current pilot study, we propose the use of quantitative metabolomics to reconstruct the phylogeny of vertebrates, namely birds. We determined the concentrations of the 67 most abundant metabolites in the eye lenses of the following 14 species from 6 orders of the class Aves (Birds): the Black kite (Milvus migrans), Eurasian magpie (Pica pica), Northern raven (Corvus corax), Eurasian coot (Fulica atra), Godlewski’s bunting (Emberiza godlewskii), Great crested grebe (Podiceps cristatus), Great tit (Parus major), Hawfinch (Coccothraustes coccothraustes), Hooded crow (Corvus cornix), House sparrow (Passer domesticus), Rock dove (Columba livia), Rook (Corvus frugilegus), Short-eared owl (Asio flammeus) and Ural owl (Strix uralensis). Further analysis shows that the statistical approaches generally used in metabolomics can be applied for differentiation between species, and the most fruitful results were obtained with hierarchical clustering analysis (HCA). We observed the grouping of conspecific samples independently of the sampling place and date. The HCA tree structure supports the key role of genomics in the formation of the lens metabolome, but it also indicates the influence of the species lifestyle. A combination of genomics-based and metabolomics-based phylogeny could potentially resolve arising issues and yield a more reliable tree of life.


Introduction
To explore the evolution of species and to describe their relationships, traditional taxonomy, for a long time (since Linnaeus and Darwin in the 18-19th centuries), was based on systemically analyzing the traits of species with respect to their morphology, physiology and behavior [1]. Currently, evolutionary biologists obtain most of their knowledge of the phylogeny of organisms through the cladistic analysis of morphological characters [2], which has been reinforced at the end of the 20th century by a wide variety of methods based on the analysis of nucleotide sequences [3]. Advances in genomics and the development of DNA/RNA sequence databases as well as computational bioinformatics algorithms have purpose. The lens is the transparent tissue of an individual, which is anatomically isolated from other tissues [32]. To be transparent, the lens tissue consists of densely packed fiber cells without nuclei and organelles, and the metabolic activity inside the lens is minimal. For this reason, the protection of the lens tissue from oxidative, osmotic and other stresses is almost entirely provided by metabolites [33,34]. The metabolite exchange between blood and the lens proceeds via aqueous humor, and the composition of the majority of metabolites in the lens reflects the smoothed in time metabolomic profile of the whole body [32]. At the same time, some important compounds, including antioxidants, osmolytes and energy metabolites, are synthesized in the metabolically active lens epithelial monolayer [32], and the composition of these metabolites indicates the lens-specific requirements related to the animal genetic features, lifestyle and feeding behavior [33,35,36].
This paper is a pilot interdisciplinary study directed at the possibility of using quantitative metabolomics and corresponding conventional unsupervised statistical approaches (principal component analysis and hierarchical clustering analysis) for the differentiation between species of the Animalia kingdom in comparison with their phylogenetic relationships (i.e., phylometabolomics [37]). To this end, we determined the concentrations of the most abundant metabolites in the eye lenses of 14 species from 6 orders of the class Aves (Birds): Accipitriformes, Columbiformes, Gruiformes, Passeriformes, Podicipediformes and Strigiformes. Our preliminary results have shown that the topology of metabolomics-based dendrograms resemble a phylogenetic tree. It was interesting to compare the topology of metabolomics-based and genomics-based dendrograms. For better representation, the current study includes both genetically close and distant bird species. In particular, eight species, including the Eurasian magpie (Pica pica), Northern raven (Corvus corax), Hooded crow (Corvus cornix), Rook (Corvus frugilegus), Godlewski's bunting (Emberiza godlewskii), Great tit (Parus major), Hawfinch (Coccothraustes coccothraustes) and House sparrow (Passer domesticus), belong to the Passeriformes order. Four out of these eight species (P. pica, C. corax, C. cornix and C. frugilegus) correspond to the Corvidae family, and the other four correspond to the Passerides infraorder. Other birds, including the Black kite (Milvus migrans), Eurasian coot (Fulica atra), Great crested grebe (Podiceps cristatus), Rock dove (Columba livia), Short-eared owl (Asio flammeus) and Ural owl (Strix uralensis), represent non-passerine orders. All species under analysis are rather widespread in Siberia (Russia), and the samples were possible to obtain.
At the molecular level, the adaptations of species are imprinted in all layers of the functional organization of life: the genome, transcriptome, proteome and metabolome. Among these layers, the genome is the most conservative, whereas the metabolome is the most susceptible to the variability of many external and internal factors. Consequently, the composition and the concentrations of metabolites in the lens, or the lens metabolome, are determined by the combination of conservative (e.g., genomic) and variable (e.g., lifestyle and feeding) factors and thus is expected to vary to some extent between species. It is interesting to establish the magnitude of such variations and to evaluate the contributions of conservative and variable factors into the formation of the total metabolome of the lens.
The major goals of the current study are: (1) to identify the most abundant metabolites in the lenses of 14 bird species from 6 orders and to establish their concentrations; (2) to analyze the metabolomic composition of bird lenses; (3) to assess the applicability of the generally used statistical metabolomic approaches to the phylogenetic differentiation between species and to compare the genomics-and metabolomics-based trees; and (4) to evaluate the influence of conservative and variable factors on the total metabolome of the lens.

Chemicals
Chloroform and methanol HPLC grade were purchased from Panreac (Barcelona, Spain). D 2 O 99.9% was purchased from Armar Chemicals (Döttingen, Switzerland). Acetonitrile HPLC grade of quality 0 was purchased from Cryochrom (St. Petersburg, Russia). All other chemicals were of the highest purity, usually HPLC or LC-MS grade, and were purchased from Sigma-Aldrich (Burlington, MA, USA). H 2 O was deionized using an Ultra Clear UV plus TM water system (SG water, Barsbüttel, Germany) to the quality of 18.2 MOhm.

Lens Sample Collection and Species Description
The study was conducted in accordance with the ARVO Statement for the Use of Animals in Ophthalmic and Vision Research and the European Union Directive 2010/63/EU on the protection of animals used for scientific purposes, and with the ethical approval from the International Tomography Center SB RAS (ECITC-2017-02).
The bird species were collected from three sources ( The species under analysis are common in Siberia, and the samples were relatively accessible. All species were adults of wild origin and were obtained within the 2017-2020 time period. Lenses were extracted clean from the bodies, placed into separate cryotubes, frozen in liquid nitrogen and kept at −70 • C until the analysis within the 2017-2021 time period. Depending on the size of a lens, each sample contained one to three lenses from different individuals (Table 1).

Sample Preparation
The lens sample preparation was performed by the procedure that has been described in detail [36,39,40]. We analyzed either one lens per sample or pooled together two or three lenses from different individuals. Each sample was weighed prior to homogenization. The typical sample and lens weights are given in Table 1. Lenses were placed in glass vials and homogenized with a TissueRuptor II homogenizer (Qiagen, Netherlands) in 1600 µL of cold (−20 • C) MeOH, and then 800 µL of water and 1600 µL of cold chloroform were added. The mixture was shaken well in a shaker for 20 min and was left at −20 • C for 30 min. Then, the mixture was centrifuged at 16,100× g, +4 • C for 30 min, yielding two immiscible liquid layers separated by a protein layer. The upper aqueous layer (MeOH-H 2 O) was collected, divided into two parts for NMR (2/3) and LC-MS (1/3) analyses and lyophilized.

NMR Measurements
The extracts for the NMR measurements were re-dissolved in 600 µL of D 2 O containing 2 × 10 −5 M sodium 4,4-dimethyl-4-silapentane-1-sulfonic acid (DSS) as an internal standard and 50 mM deuterated phosphate buffer to maintain a pH of 7.2. The 1 H NMR measurements were carried out at the Center of Collective Use "Mass spectrometric investigations" SB RAS with the use of an NMR spectrometer AVANCE III HD 700 MHz (Bruker BioSpin, Ettlingen, Germany) equipped with a 16.44 Tesla Ascend cryomagnet, as described in [39]. The concentrations of the metabolites in the samples were determined by the peak area integration relative to the internal standard DSS.

LC-MS Measurements
In this work, LC-MS data were used only for the identification of several unknown compounds and for the confirmation of the data obtained by the NMR method. The extracts for the LC-MS analysis were re-dissolved in 100 µL of 10 mM ammonium formate and 0.1% formic acid solution in H 2 O. The LC separation was performed with an UltiMate 3000RS chromatograph (Dionex, Bremen, Germany) using hydrophilic interaction liquid chromatography (HILIC) on a TSKgel Amide-80 HR (Tosoh Bioscience, Griesheim, Germany) column (4.6 × 250 mm, 5 µm). The MS detection was performed with an ESI-Q-TOF high-resolution hybrid mass spectrometer maXis 4G (Bruker Daltonics, Bremen, Germany), as described earlier [34,40].

Metabolomic Data Analysis
The statistical treatment of the quantitative metabolomics data-principal component analysis (PCA), hierarchical cluster analysis (HCA) and heatmap construction-was performed at the MetaboAnalyst 5.0 web-platform (www.metaboanalyst.ca (accessed on 10 January 2022) [41]) with the data either non-scaled, auto-scaled (mean-centered and divided by the standard deviation of each metabolite concentration) or Pareto-scaled (mean-centered and divided by the square root of the standard deviation of each metabolite concentration). Auto-scaled and Pareto-scaled methods were used to normalize the contributions of the metabolites in further analyses.
For the construction of dendrograms based on hierarchical clustering, several parameters related to the HCA data treatment need to be chosen: the method of the data normalization (e.g., non-scaled, auto-scaled and Pareto-scaled), the type of the distance\similarity measure (e.g., Euclidian, Spearman and Pearson), and the clustering algorithm (e.g., Ward, Average, Complete and Single). The HCA parameter values listed above are available as built-in options at the widely-used-in-metabolomics web platform MetaboAnalyst v5.0, where the current analysis was performed. It should be noted that other tree-building software or the Python/R statistical packages allow for choosing a wider variety of HCA parameter values (e.g., Manhattan distance, Bray-Curtiss dissimilarity, WPGMA clustering, etc.). However, the current pilot work is aimed at assessing the general applicability of the quantitative metabolomics-based HCA method for phylogeny. Thereby, we decided not to go deep into developing the correct HCA parameters, and thus we used only the built-in MetaboAnalyst ones. Moreover, we did not restrict possible combinations of values. The general guideline for choosing the appropriate HCA parameters is to use ones that maximize the distance between samples in different classes and to minimize that within each class.

Phylogenetic Tree Reconstructions for Birds from the Literature
In a preliminary analysis of the metabolomics-based dendrograms, we observed similarity in the topology with the published phylogenetic dendrograms based on different approaches. To test this hypothesis and to compare the metabolomics-based trees and the "classical" genomics-or transcriptomics-based phylogeny, we manually reconstructed two phylogenetic trees from the literature to fit only the 14 bird species under study. The backbone parts of a tree from the class (Aves) to the orders (Columbiformes, Podicipediformes, Gruiformes, Accipitriformes, Strigiformes, and Passeriformes) were based on two recently proposed modern bird phylogenies, either on work of Jarvis ED et al. [24] or on the newer work of Kuhl H et al. [27]. The vast order Passeriformes includes 8 out of 14 species under investigation, and a more detailed tree for this order was adapted from the work of Oliveros CH et al. [42]. To this purpose, we removed the clades not containing the species under study and connected the remaining branches keeping the tree structure. The time scale and the time of divergence were discarded, and the resulting tree was rather schematic for displaying the phylogenetic relationships.

The Identification and Quantification of Metabolites
The identification of metabolites was performed according to their NMR spectra that are available in the literature, in databases (HMDB, METLIN, BMRB and SpectraBase) and in our in-house NMR library [33,35,39]. In some cases, when NMR signal assignment was not obvious, we spiked the lens extract with commercial standard compounds and validated the identification of metabolites. For the identification of unknown signals, we also used the fractioning of the metabolomic extract by HPLC followed by the MS and NMR analysis of each fraction, as described in [40]. Nevertheless, several signals in NMR spectra remained unidentified. These signals, together with the identified metabolites, are included in Table 2. They are annotated as S109, S112, S120, D121, D139, T727 and S823 (Supplementary Figure S1), where the letter indicates the multiplicity of the signal (S for singlet, D for doublet and T for triplet), and the numerals show the chemical shift of the signal (for example, S109 corresponds to the singlet at 1.09 ppm).
The concentrations of the metabolites in the lenses (in nmol/g) were calculated by the integration of the NMR signals relative to the internal standard DSS followed by normalization to the tissue wet weight. Typically, 60-80 compounds were identified for every species; however, the NMR signals from some compounds were either too weak or strongly overlapped by other signals, which made the quantification of these compounds unreliable. For that reason, the final set of metabolites studied in this work was restricted to 67 identified compounds and 7 unknowns. For every species, except A. flammeus and S. uralensis, the measurements were performed for 3-12 individuals (Table 1,  Supplementary Table S1), and the results were averaged. For the rare species A. flammeus and S. uralensis, the lenses from only one individual were analyzed. The obtained data expressed as the mean ± standard deviation (SD) are collected in Table 2. The relative abundances of 10 major (in average) metabolites in the lenses of 14 species under study are shown in Figure 1. In Supplementary Table S2, the most abundant metabolites from Table 2 are sorted in descending order and are highlighted in each species separately to facilitate further analysis: a red color indicates a metabolite concentration above 10 µmol/g, a yellow color indicates a concentration between 3 and 10 µmol/g and a green color indicates a concentration between 1 and 3 µmol/g.     1 Lens from only one individual was analyzed. 2 Low metabolite identification confidence, not confirmed by chemical standards. Concentrations of unknowns were estimated assuming that the signals in the aliphatic part of NMR spectra (S109, S112, S120, D121 and D139) correspond to a single CH 3 group, whereas the aromatic signals (T727 and S823) correspond to a CH group. 3 Abbreviations: 2-OH-3-Me-but-2-hydroxy-3-methyl-butyrate; 2-OH-but-2-hydroxy-butyrate; 3-Me-His-3-methylhistidine; 3-OH-but-3hydroxy-butyrate; 3-OH-isovalerate-3-hydroxy-isovalerate; ADP-adenosine diphosphate; alpha-Aminobut-alpha-aminobutyrate; alpha-OH-isobut-alpha-hydroxy-isobutyrate; AMP-adenosine monophosphate; ATP-adenosine triphosphate; Gl-Ph-Choline-glycerophosphocholine; GSSG-glutathione oxidized; GTP-guanosine triphosphate; N,N-DMG-N,N-dimethylglycine; NAA-N-acetyl-aspartate; NAD-nicotinamide adenine dinucleotide; NADH-nicotinamide adenine dinucleotide reduced; NADPH-nicotinamide adenine dinucleotide phosphate reduced; N-Me-His-N-methylhistidine; Ph-Choline-phosphocholine; UDP-uridine diphosphate. A list with the metabolite full names and ChEBI identifiers is provided in Supplementary Table S1.
abundances of 10 major (in average) metabolites in the lenses of 14 species under study are shown in Figure 1. In Supplementary Table S2, the most abundant metabolites from  Table 2 are sorted in descending order and are highlighted in each species separately to facilitate further analysis: a red color indicates a metabolite concentration above 10 µ mol/g, a yellow color indicates a concentration between 3 and 10 µ mol/g and a green color indicates a concentration between 1 and 3 µ mol/g.

General Overview of Bird Lens Metabolomes
There were two principal metabolites in the bird lenses with an average concentration above 20 µ mol/g: myo-inositol and taurine. In the lens of A. flammeus, the concentration of lactate was also slightly above 20 µ mol/g. Both myo-inositol and taurine are well-known osmolytes [34,35,43,44], protecting lens fiber cells from osmotic stress. In all species except P. cristatus, S. uralensis and A. flammeus, both osmolytes share the

General Overview of Bird Lens Metabolomes
There were two principal metabolites in the bird lenses with an average concentration above 20 µmol/g: myo-inositol and taurine. In the lens of A. flammeus, the concentration of lactate was also slightly above 20 µmol/g. Both myo-inositol and taurine are well-known osmolytes [34,35,43,44], protecting lens fiber cells from osmotic stress. In all species except P. cristatus, S. uralensis and A. flammeus, both osmolytes share the function of osmotic protection and have very high abundance above 10 µmol/g (Tables 2 and S2). In the lenses of F. atra, C. corax, P. pica, M. migrans, C. cornix, C. frugilegus and C. livia, myo-inositol has a higher abundance than taurine (up to 2.5 times); in P. cristatus, myo-inositol prevails tenfold over taurine. In all four Passerides (C. coccothraustes, E. godlewskii, P. domesticus and P. major), taurine has a higher abundance than myo-inositol (up to 1.9 times), and in both Strigidae (S. uralensis and A. flammeus), taurine prevails 5-10 times over myo-inositol.
Eight further metabolites had average concentrations below 10 and above 3 µmol/g, namely, lactate, glutamine, acetate, glutathione, alanine, ergothioneine, serine and ATP. These compounds can be ascribed to the major metabolites of bird lenses (Supplementary Table S2). High levels of these metabolites indicate their important roles in cellular processes. In living cells, including metabolically inert lens fiber cells, they are involved in various functioning: osmotic protection (myo-inositol, taurine, glutamine and serine); antioxidant protection (glutathione and ergothioneine); and cellular energy generation (ATP, acetate, glutamine and alanine).
The composition of the most abundant metabolites vary from species to species, and the most pronounced differences in comparison with other species are observed for both birds from the Strigidae family, A. flammeus and S. uralensis, and for P. cristatus (Figure 1). It should be also noted that the levels of major metabolites in all passerine species are rather similar (Figure 1, Supplementary Table S2), indicating the importance of the genetic factor in the formation of the tissue metabolome.

Principal Component Analysis (PCA)
PCA is the method of choice in metabolomics to take a glance at the obtained results and to display general similarities and differences in the data. Figure 2a shows a PCA scores plot based on auto-scaled data for all bird species studied in this work.
The composition of the most abundant metabolites vary from species to species, and the most pronounced differences in comparison with other species are observed for both birds from the Strigidae family, A. flammeus and S. uralensis, and for P. cristatus (Figure 1). It should be also noted that the levels of major metabolites in all passerine species are rather similar (Figure 1, Supplementary Table S2), indicating the importance of the genetic factor in the formation of the tissue metabolome.

Principal Component Analysis (PCA)
PCA is the method of choice in metabolomics to take a glance at the obtained results and to display general similarities and differences in the data. Figure 2a shows a PCA scores plot based on auto-scaled data for all bird species studied in this work.  One can see that the conspecific samples are grouped together in the plot, although they are often not clearly separated from other species along the first two principal components (PC1 and PC2). It is worth noting that, independently of the sampling place and date (e.g., C. corax, Table 1), the grouping of conspecific samples is observed. Moreover, such a grouping is observed in the PCA plots independently of the data scaling (Supplementary Figure S2, for non-scaled and Pareto-scaled PCA). The most distant species in the plot (Figure 2a) are A. flammeus, S. uralensis and P. cristatus. All birds from the Passeriformes order form a cluster at the upper left part of the plot, and inside this cluster, the groups of species belonging to the Corvides and the Passerides infraorders are visibly separated. One of the main advantages of PCA is that the analysis is unsupervised. However, it works well for a limited number of groups of samples. When the number of groups increases, the discrimination between groups (if it exists) may start to vanish in the PC1 vs. PC2 plot. In this case, the search for the discrimination between groups can include the use of additional dimensions with subsequent principal components (PC3, PC4, etc.) or reductions in the number of groups by removing distant groups. In the present work, we performed PCA for 67 samples from 14 groups (Figure 2, Supplementary Figure S2). The grouping of conspecific samples is rather good, but the discrimination between groups in the PC1 vs. PC2 plot is unreliable. For better discrimination of unresolved species, we removed phylogenetically distant species and left the Passeriformes order only (39 samples, 8 groups); the resulting PCA scores plot is presented in Figure 2b. Both infraorders, the Corvides and the Passerides, in Figure 2b, are well-separated. Inside the Passerides infraorder, a separation between species and the grouping of the same species are clearly visible; P. domesticus samples are now the most distant from other Passerides. However, no clear separation is observed for the Corvides.

Hierarchical Clustering Analysis (HCA) and Heatmaps
We performed the analysis for all 36 possible combinations of available HCA parameter values (Supplementary Figure S3). The quality of the obtained dendrograms was monitored according to the following criteria: conspecific samples should be in one cluster and should not mix with other species; the Passerides and the Corvides samples should form larger clusters and link with each other to form the Passeriformes cluster; and P. cristatus and C. livia should be distant from other clusters.
Fairly good quality of dendrograms was obtained for most combinations of the HCA parameters. As an example, the HCA tree plotted with the default set of built-in parameters, with auto-scaled data plotted with the Euclidean distance similarity measure and using Ward's linkage clustering algorithm (AEW), is demonstrated in Figure 3. The quality of the tree is rather high. Conspecific samples are clustered together (except P. pica and C. cornix), and Passerides and the Corvides samples form a larger Passeriformes cluster. Similar to the PCA analysis, the grouping of the conspecific samples is observed independently of other factors, such as age, sex or sampling place and date (e.g., C. corax, Table 1). Similarly, good clustering was obtained with non-scaled and Pareto-scaled data plotted with the Spearman's rank correlation similarity measure and with the use of a singlelinkage clustering algorithm (NSS and PSS, Supplementary Figure S4). Nevertheless, many non-scaled and Pareto-scaled HCA trees show rather poor clustering of conspecific samples and further positioning of larger clusters due to the data scaling.
Data normalization (scaling) is often required to equalize the contributions of highand low-abundant metabolites into analysis; if one uses non-scaled data, the position of the species in the dendrogram is determined by the concentration variation for the several most abundant metabolites only. The exception from this rule is the use of non-scaled data with nonparametric statistics (e.g., Spearman), since the latter discards the information on the concentrations. On the other hand, taking into account the information on metabolite abundance in some specific form can potentially be helpful for phylogeny. For example, for PSS clustering (Supplementary Figure S4), the position of the P. cristatus species is the most correct compared to the other HCA trees, and the samples are the most distant from other clusters.
Heatmaps add a second visual dimension to the HCA analysis. Supplementary Figure S5 shows a heatmap chart constructed for the AEW clustering ( Figure 3). The advantage of this type of result presentation in comparison with the HCA dendrogram is that the additional information is visualized; the metabolites participating in the sample clustering and dendrogram construction are clearly seen in the heatmap. Supplementary Figure S5 shows that approximately twenty compounds (leucine, NADPH, creatine, etc.; upper-left corner) in P. cristatus differ significantly from other species. For C. coccothraustes, two unknowns (S112 and S120) are flaring. For both Strigidae species, 10-11 metabolites (NAA, valine, phenylalanine, etc.; lower-left corner in Supplementary Figure S5) stand out. The F. atra species has an elevated amount of asparagine, anserine and carnosine. Data normalization (scaling) is often required to equalize the contributions of highand low-abundant metabolites into analysis; if one uses non-scaled data, the position of the species in the dendrogram is determined by the concentration variation for the several most abundant metabolites only. The exception from this rule is the use of non-scaled data with nonparametric statistics (e.g., Spearman), since the latter discards the information on the concentrations. On the other hand, taking into account the information on metabolite abundance in some specific form can potentially be helpful for

Genomics-and Transcriptomics-Based Schematic Phylogenetic Tree Construction from the Literature
To test the assumption of similarity in the topology of metabolomics-based HCA dendrograms and the phylogenetic dendrogram of the studied species, we manually constructed two schematic literature-based phylogenetic dendrograms. The first tree, reconstructed from phylogenomic research by Jarvis ED et al. [24] and Oliveros CH et al. [42], is presented in Figure 4a. The second tree is based on newer extensive phylotranscriptomics research from Kuhl H et al. [27] (Figure 4b) These differences are marked with arrows in Figure 4.

Genomics-and Transcriptomics-Based Schematic Phylogenetic Tree Construction from the Literature
To test the assumption of similarity in the topology of metabolomics-based HCA dendrograms and the phylogenetic dendrogram of the studied species, we manually constructed two schematic literature-based phylogenetic dendrograms. The first tree, reconstructed from phylogenomic research by Jarvis ED et al. [24] and Oliveros CH et al. [42], is presented in Figure 4a. The second tree is based on newer extensive phylotranscriptomics research from Kuhl H et al. [27] (Figure 4b). The main difference between Kuhl H et al. and Jarvis ED et al. trees, related to our 14 bird species, is that, in the Kuhl H et al. tree, there is no node between the Columbiformes and the Podicipediformes, which corresponds to the Columbea clade in the Jarvis ED et al. tree. Moreover, the Columbiformes order in the Kuhl H et al. tree forms a cluster with the Gruiformes, forming the Basal landbirds clade. These differences are marked with arrows in Figure 4.

Discussion
It can be safely said that birds rely deeply on their eyesight for living. The optical system of the eye is evolutionarily adapted for the specific needs of a bird [45]. The adaptation factors include, but are not limited to, the habitat, lifespan, feeding behavior, hunting behavior and diurnal or nocturnal lifestyle. It can be expected that adaptations strongly influence the morphology and composition of the optical apparatus, particularly the lens. A significant part of these adaptations is secured at the genomic level, but the feeding behavior and lifestyle of certain bird species may also have a noticeable influence, especially at the metabolomic level.
In modern metabolomics, three general types of data can be distinguished: qualitative, semi-quantitative and quantitative data. Qualitative data either determine the presence of metabolites in a tissue or answer the question of whether the relative content of metabolites is higher or lower between the two groups of samples. The semi-quantitative approach is aimed at the quantitative (numeric) comparison of the relative content of metabolites in different groups of samples. Most often, semi-quantitative data are obtained with the use of electrospray ionization-based mass spectrometry (ESI-MS). The ESI-MS signal intensity is not directly proportional to the metabolite concentration; metabolites with a low concentration but a high ionization ability can give more intense signals than metabolites with a high concentration but a low ionization ability. In addition, matrix effects and ionization suppression can lead to the biased interpretation of data [33,46].
The quantitative approach yields absolute concentrations of metabolites in a tissue (e.g., in moles per gram). It demands much more effort and time than qualitative or semiquantitative methods, but the data obtained have a long-term value. In addition, the results of such experiments are available for future 'eternal' re-use, such as for data mining, new interpretations, the addition of new samples or groups, new comparisons, etc. [47]. That cannot be achieved with ESI-MS semi-quantitative data; it is almost impossible to add new samples to the previous experiments due to the limitations of ESI-MS instruments. For quantitative metabolomics, the method of proton nuclear magnetic resonance spectroscopy ( 1 H NMR) is often used. Peak areas of the metabolite signals in NMR spectra are directly proportional to the compound concentrations, which makes the determination of the metabolite levels in a tissue easy and straightforward. The results of the present work, obtained with the use of NMR-based quantitative metabolomics, demonstrate that the data obtained can be used not only for the comparison of the biological features of different species, but also for animal classification and for studying the evolution of biochemical processes in living nature.
We have found that the composition and concentrations of the most abundant metabolites vary from species to species, but their levels in genetically close species are very similar, which indicates the importance of the genetic factor in the formation of the tissue metabolome. The eye lenses of all birds contain very high concentrations of two metabolites, myo-inositol and taurine, which share the function of lens osmotic protection [33,34,43,44]. The ratio between their concentrations also strongly depends on the species phylogeny; genetically close species have a rather similar ratio (Tables 1 and S1, Figure 1). An especially pronounced shift in this ratio is observed for both of the Strigidae species (taurine abundance is much higher). Earlier in our lab, we found that the lenses of another nocturnal animal, the Rat (R. norvegicus), contain high concentrations of taurine, which prevails almost tenfold over myo-inositol (13.2 and 1.8 µmol/g correspondingly) [33]. It has been also recently suggested [48] that, besides osmotic protection, myo-inositol in the eye lens may act as a chaperone, protecting the lens proteins from the aggregation caused by posttranslational modifications during a lifespan. Most likely, low levels of myo-inositol in owl lenses should be attributed to their nocturnal lifestyle, so the light-induced protein modifications for these species are less dangerous.
A number of quantified metabolites in bird lenses, on average, have rather high concentrations (myo-inositol, taurine, lactate, glutamine, acetate, glutathione, alanine, ergothioneine, serine, ATP, glutamate, creatine, pyroglutamate, glucose, phosphocholine, valine, ADP, glycine, proline, betaine, methionine, anserine, leucine, UDP and NADH), which emphasizes their significance in important biochemical processes that are genetically adapted to the lifestyle necessities of the species. These high-abundant metabolites are involved in osmotic protection, antioxidant protection and cellular energy generation (glycolysis, the TCA cycle and the urea cycle). It should be noted that a metabolite usually shares several functions and participates in several pathways; thereby, the metabolite role is not limited to the given functions and pathways.
The obtained results show that evolutionary history is a major factor determining the metabolomic composition of a lens. The genetic factor prevails over other factors in sample positioning in the HCA dendrogram: the difference in age, sex, place of catching, etc., for conspecific species influence positioning less than interspecific differences. The following observations also support the predominance of the genetic factor: (A) genetically close bird species have similar concentrations of the most abundant metabolites ( Figure 1 and Table 2, Supplementary Table S2) consideration. Although no examined HCA tree ideally fits into the phylogenetic trees (Supplementary Figure S3), the general phylogenetic dependencies are well-reproduced. Figure 5 shows the comparison of the simplified phylogenetic trees and the dendrograms AEW, NSS and PSS constructed according to the metabolomic data.
Biology 2022, 11, x FOR PEER REVIEW 17 of 22 Figure 5. Comparison of metabolomics-based and "classical" dendrograms. Three HCA dendrograms (AEW, PSS and NSS) and two "classical" dendrograms (the phylogenomic-based tree from Jarvis ED et al. [24] and the phylotranscriptomic-based tree from Kuhl H et al. [27]) were analyzed. The dashed green lilac arrow indicates the absence of Columbea-like clade in HCA dendrograms; green and dark violet arrows indicate the significant mispositioning of C. livia and F. atra, correspondingly; dashed magenta, mint and light green arrows indicate the slight mispositioning of C. frugilegus, P. pica and P. major, correspondingly; dashed aquamarine arrows indicate the mixing of the Corvidae samples in the PSS dendrogram; the orange arrow indicates the absence of the the Afroaves/Higher landbirds-like clade in the NSS dendrogram.
The following features can often be observed for the obtained dendrograms ( Figure  5):    [24] or from the Higher landbirds clade [27], are positioned nearby, and their samples are not mixed with other species. • C. livia and P. cristatus are distant from other clusters.
The following differences between the HCA and phylogenetic trees (shown by arrows in Figure 5) should be mentioned: • A node between C. livia and P. cristatus from the Columbea clade in the Jarvis ED et Figure 5. Comparison of metabolomics-based and "classical" dendrograms. Three HCA dendrograms (AEW, PSS and NSS) and two "classical" dendrograms (the phylogenomic-based tree from Jarvis ED et al. [24] and the phylotranscriptomic-based tree from Kuhl H et al. [27]) were analyzed. The dashed green lilac arrow indicates the absence of Columbea-like clade in HCA dendrograms; green and dark violet arrows indicate the significant mispositioning of C. livia and F. atra, correspondingly; dashed magenta, mint and light green arrows indicate the slight mispositioning of C. frugilegus, P. pica and P. major, correspondingly; dashed aquamarine arrows indicate the mixing of the Corvidae samples in the PSS dendrogram; the orange arrow indicates the absence of the the Afroaves/Higher landbirds-like clade in the NSS dendrogram.
The following features can often be observed for the obtained dendrograms ( Figure 5):   [24] or from the Higher landbirds clade [27], are positioned nearby, and their samples are not mixed with other species.
• C. livia and P. cristatus are distant from other clusters.
The following differences between the HCA and phylogenetic trees (shown by arrows in Figure 5) should be mentioned:

•
A node between C. livia and P. cristatus from the Columbea clade in the Jarvis ED et al. tree [24] does not exist in any HCA dendrogram (dashed green lilac arrow). • F. atra in the HCA dendrogram is positioned close to the Corvides infraorder, most likely indicating the influence of lifestyle on the metabolomic composition of the F. atra eye lens (dark violet arrow); no node between F. atra and C. livia (the Basal landbirds clade from Kuhl H et al. tree [27]) was found in any HCA dendrogram. • Although C. frugilegus and P. pica are well clustered with the Corvides, they have rather incorrect phylogenetic distances within the Corvides. C. frugilegus should be closer to the other species from the Corvus genus (C. corax and C. cornix), and P. pica should be the sister taxon to all Corvus. The samples of P. pica, C. corax and C. cornix often mix together, without the formation of separate clusters for each species (dashed magenta, mint and aquamarine arrows).

•
Similarly, the incorrect positioning is observed for P. major; it should be more distant from the other species of the Passerida parvorder (light green arrow).
These observations support the key role of genomics in the formation of the lens metabolome, but they also indicate the influence of lifestyle. In particular, the F. atra, P. Pica and P. major discrepancies in the positioning in the HCA trees most likely originate from lifestyles and feedings. Moreover, our metabolomics data analysis supports the tree structure proposed by Kuhl H et al. better than that by Jarvis ED et al. due to the absence of the Columbea-like clade.
In the current pilot study, we assessed only built-in parameters from the MetaboAnalyst web-platform for the phylogenetic dendrogram construction. Most likely, choosing other types of metrics or clustering algorithms may yield better results. The further development of methods for metabolomics-based tree construction can most likely demand adaptations of other distance-matrix or non-distance-matrix-based methods that are now widely used in phylogenetics, e.g., the maximum likelihood method or parsimony analysis. In addition, the perfect tree should be stable, and bootstrapping-like methods are also required for the adaptation.

Conclusions
In the current paper, we applied methods of quantitative metabolomics and corresponding statistical approaches for the differentiation between 14 species from 6 orders of the class Aves (Birds) and for further phylometabolomic tree construction. We determined the concentrations of the most abundant metabolites in the eye lenses of the species and deposited the corresponding raw NMR spectra and the metabolomic analysis into our Animal Metabolite Database repository (https://amdb.online (accessed on 10 January 2022)). The most fruitful results were obtained with the hierarchical clustering analysis. The topology of the obtained dendrograms is very similar to the topology of genetics-based trees, and general phylogenetic dependencies are well-reproduced, although none of the HCA dendrograms ideally fit to the genetics-based trees. The HCA dendrogram structure supports the key role of genomics in the formation of the lens metabolome, but it also indicates the influence of lifestyle. Very likely, most discrepancies in the HCA dendrograms as compared to genomics-based phylogenetic trees originate from different lifestyles and feedings.
Perhaps, the addition of metabolomic data for a larger number of bird species into the analysis, as well as the development of HCA methods, can produce a clearer correlation and similarity of the phylogeny of birds with eye lens metabolomics-based hierarchical clustering. A combination of phylogenetic and phylometabolomic analyses can potentially solve issues in the reconstruction of bird phylogeny and can yield a more reliable tree of life. Current methods can be applied to differentiate other species of vertebrates, and with several adaptations to other species of the animal kingdom and even of other kingdoms, the only limitation is that the tissue under analysis is conservative and comparable between all species under analysis.

Supplementary Materials:
The following supporting information can be downloaded at: https:// www.mdpi.com/article/10.3390/biology11071089/s1, Figure S1: Representative 1 H NMR spectrum of bird lens metabolome; Figure S2: PCA scores plots for non-scaled (left panel) and Pareto-scaled (right panel) data; Figure S3: HCA clustering results; Figure S4: HCA dendrograms obtained for nonscaled (left panel, NSS) and Pareto-scaled (right panel, PSS) data; Figure S5: Clustering results shown as a heatmap; Table S1: Concentrations of metabolites in the lenses of 14 bird species; Table S2: Sorted average concentrations of metabolites in 14 bird species in nmol/g, color coded..

Informed Consent Statement: Not applicable.
Data Availability Statement: Raw NMR spectra, the descriptions of specimens and samples, metabolite concentrations and the preliminary metabolomic analysis are available at our Animal Metabolite Database repository, Experiment ID 145 (https://amdb.online/amdb/experiments/145/). All other data are available from the corresponding author upon request.